Eigenvoice conversion based on Gaussian mixture model
نویسندگان
چکیده
This paper describes a novel framework of voice conversion (VC). We call it eigenvoice conversion (EVC). We apply EVC to the conversion from a source speaker’s voice to arbitrary target speakers’ voices. Using multiple parallel data sets consisting of utterancepairs of the source and multiple pre-stored target speakers, a canonical eigenvoice GMM (EV-GMM) is trained in advance. That conversion model enables us to flexibly control the speaker individuality of the converted speech by manually setting weight parameters. In addition, the optimum weight set for a specific target speaker is estimated using only speech data of the target speaker without any linguistic restrictions. We evaluate the performance of EVC by a spectral distortion measure. Experimental results demonstrate that EVC works very well even if we use only a few utterances of the target speaker for the weight estimation.
منابع مشابه
Maximum a posteriori adaptation for many-to-one eigenvoice conversion
Many-to-one eigenvoice conversion (EVC) allows the conversion from an arbitrary speaker’s voice into the pre-determined target speaker’s voice. In this method, a canonical eigenvoice Gaussian mixture model is effectively adapted to any source speaker using only a few utterances as the adaptation data. In this paper, we propose a many-to-one EVC based on maximum a posteriori (MAP) adaptation for...
متن کاملSpeaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model
One-to-many eigenvoice conversion (EVC) allows the conversion of a specific source speaker into arbitrary target speakers. Eigenvoice Gaussian mixture model (EV-GMM) is trained in advance with multiple parallel data sets consisting of the source speaker and many pre-stored target speakers. The EV-GMM is adapted for arbitrary target speakers using only a few utterances by estimating a small numb...
متن کاملEsophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
This paper presents a novel method of enhancing esophageal speech using statistical voice conversion. Esophageal speech is one of the alternative speaking methods for laryngectomees. Although it doesn’t require any external devices, generated voices usually sound unnatural compared with normal speech. To improve the intelligibility and naturalness of esophageal speech, we propose a voice conver...
متن کاملOne-to-Many Voice Conversion Based on Tensor Representation of Speaker Space
This paper describes a novel approach to flexible control of speaker characteristics using tensor representation of speaker space. In voice conversion studies, realization of conversion from/to an arbitrary speaker’s voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice Gaussian mixture model (EV-GMM) was proposed. In the EVC, similarly t...
متن کاملEffects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion
This paper introduces speaker adaptive training techniques to tensor-based arbitrary speaker conversion. In voice conversion studies, realization of conversion from/to an arbitrary speaker’s voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC), which is based on an eigenvoice Gaussian mixture model (EV-GMM), was proposed. Although the EVC can effectively const...
متن کاملMon.O1d.06 Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion
This paper introduces speaker adaptive training techniques to tensor-based arbitrary speaker conversion. In voice conversion studies, realization of conversion from/to an arbitrary speaker’s voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC), which is based on an eigenvoice Gaussian mixture model (EV-GMM), was proposed. Although the EVC can effectively const...
متن کامل